Predicting RNA secondary structures with pseudoknots by MCMC sampling.

نویسندگان

  • Dirk Metzler
  • Markus E Nebel
چکیده

The most probable secondary structure of an RNA molecule, given the nucleotide sequence, can be computed efficiently if a stochastic context-free grammar (SCFG) is used as the prior distribution of the secondary structure. The structures of some RNA molecules contain so-called pseudoknots. Allowing all possible configurations of pseudoknots is not compatible with context-free grammar models and makes the search for an optimal secondary structure NP-complete. We suggest a probabilistic model for RNA secondary structures with pseudoknots and present a Markov-chain Monte-Carlo Method for sampling RNA structures according to their posterior distribution for a given sequence. We favor Bayesian sampling over optimization methods in this context, because it makes the uncertainty of RNA structure predictions assessable. We demonstrate the benefit of our method in examples with tmRNA and also with simulated data. McQFold, an implementation of our method, is freely available from http://www.cs.uni-frankfurt.de/~metzler/McQFold.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Predicting RNA Secondary Structures with Pseudoknots by MCMC Sampling . — preprint —

The most probable secondary structure of an RNA molecule, given the nucleotide sequence, can be computed efficiently if a stochastic context-free grammar (SCFG) is used as the prior distribution of the secondary structure. The structures of some RNA molecules contain so-called pseudoknots. Allowing all possible configurations of pseudoknots is not compatible with context-free grammar models and...

متن کامل

SimulFold: Simultaneously Inferring RNA Structures Including Pseudoknots, Alignments, and Trees Using a Bayesian MCMC Framework

Computational methods for predicting evolutionarily conserved rather than thermodynamic RNA structures have recently attracted increased interest. These methods are indispensable not only for elucidating the regulatory roles of known RNA transcripts, but also for predicting RNA genes. It has been notoriously difficult to devise them to make the best use of the available data and to predict high...

متن کامل

RNA Pseudoknot Prediction in Energy-Based Models

RNA molecules are sequences of nucleotides that serve as more than mere intermediaries between DNA and proteins, e.g., as catalytic molecules. Computational prediction of RNA secondary structure is among the few structure prediction problems that can be solved satisfactorily in polynomial time. Most work has been done to predict structures that do not contain pseudoknots. Allowing pseudoknots i...

متن کامل

Predicting RNA Secondary Structures with Arbitrary Pseudoknots by Maximizing the Number of Stacking Pairs

The paper investigates the computational problem of predicting RNA secondary structures. The general belief is that allowing pseudoknots makes the problem hard. Existing polynomial-time algorithms are heuristic algorithms with no performance guarantee and can handle only limited types of pseudoknots. In this paper, we initiate the study of predicting RNA secondary structures with a maximum numb...

متن کامل

RNAVLab: A virtual laboratory for studying RNA secondary structures based on grid computing technology

As ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation, their secondary structures have been the focus of many recent studies. Despite the computing power of supercomputers, computationally predicting secondary structures with thermodynamic methods is still not feasible when the RNA molecules have long nucleotide sequences ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Journal of mathematical biology

دوره 56 1-2  شماره 

صفحات  -

تاریخ انتشار 2008